ABSTRACT

We use the new ultra-deep, near-infrared imaging of the Hubble Ultra-Deep Field (HUDF) 
provided by our UDF12 Hubble Space Telescope (HST) WFC3/IR campaign to explore the 
rest-frame ultraviolet (UV) properties of galaxies at redshifts z > 6.5. We present the first 
unbiased measurement of the average UV power-law index, ⟨β⟩ (f_λ ∝ λ^β), for faint galaxies
at z ~ 7, the first meaningful measurements of ⟨β⟩ at z ~ 8, and tentative estimates for a
new sample of galaxies at z ~ 9. Utilising galaxy selection in the new F140W (J140) imaging
to minimize colour bias, and applying both colour and power-law estimators of β, we find
⟨β⟩ = −2.1 ± 0.2 at z ~ 7 for galaxies with M_UV ~ −18. This means that the faintest
galaxies uncovered at this epoch have, on average, UV colours no more extreme than those
displayed by the bluest star-forming galaxies at low redshift. At z ~ 8 we find a similar
value, ⟨β⟩ = −1.9 ± 0.3. At z ~ 9, we find ⟨β⟩ = −1.8 ± 0.6, essentially unchanged from
z ~ 6-7 (albeit highly uncertain). Finally, we show that there is as yet no evidence for a
significant intrinsic scatter in β within our new, robust z ~ 7 galaxy sample. Our results are
most easily explained by a population of steadily star-forming galaxies with either ~ solar
metallicity and zero dust, or moderately sub-solar (~ 10-20%) metallicity with modest dust
obscuration (A_V ~ 0.1-0.2). This latter interpretation is consistent with the predictions of a
state-of-the-art galaxy-formation simulation, which also suggests that a significant population
of very-low metallicity, dust-free galaxies with β ~ −2.5 may not emerge until M_UV > −16,
a regime likely to remain inaccessible until the James Webb Space Telescope.

Key words: galaxies: high-redshift - galaxies: evolution - galaxies: formation - galaxies: 
stellar populations - cosmology: reionization 



1 INTRODUCTION 

The revolution in very-deep, near-infrared imaging provided by the
2009 refurbishment of the Hubble Space Telescope (HST) with the
Wide Field Camera 3 (WFC3/IR) has enabled the discovery and
study of the first substantial samples of galaxies at z > 6.5 (see,
e.g., Dunlop 2012 for a review). Following the instant success of
the initial deep Y105, J125, H160 UDF09 imaging (GO 11563; PI:
G. Illingworth) of the Hubble Ultra-Deep Field (HUDF; Beckwith
et al. 2006) and associated parallel fields (e.g. Oesch et al. 2010;
Bouwens et al. 2010a, 2011; McLure et al. 2010, 2011; Finkelstein
et al. 2010; Bunker et al. 2010), WFC3/IR has been used to conduct
wider-area surveys for more luminous galaxies at z = 7-8, both
through the CANDELS Treasury programme (Grogin et al. 2011; 
Koekemoer et al. 2011; Grazian et al. 2012; Oesch et al. 2012), 
and through parallel imaging programmes such as the BoRG sur- 
vey (e.g. Bradley et al. 2012). Most recently, attention has been 
refocussed on pushing to even fainter magnitudes and still higher 
redshifts, either with the assistance of gravitational lensing (e.g. 
through the CLASH Treasury Programme; Zheng et al. 2012; Coe 
et al. 2012), or through our own ultra-deep WFC3/IR imaging in 
the HUDF (GO 12498; PI: R. Ellis, hereafter UDF12). 

Our recently-completed 128-orbit UDF12 observations reach
5-σ detection limits of Y105 = 30.0, J125 = 29.5, J140 = 29.5, H160
= 29.5 (after combination with the UDF09 data), and are the deepest
near-infrared images ever taken (Ellis et al. 2012). A detailed
description of the UDF12 data-set is provided by Koekemoer et al.
(2012), and the final reduced images will be available on the team
web-page.

These new, ultra-deep, multi-band near-infrared images have 
already yielded the first significant sample of galaxies at z ~ 9, 
including a possible candidate at z ~ 12 (Ellis et al. 2012). The 
discovery of galaxies at z > 8.5 was a key design goal of this
programme, and motivated the first inclusion of deep J140 imaging
in the HUDF (elsewhere, J140 imaging has also played a key role
in enabling CLASH to yield convincing galaxy candidates out to
z ~ 10.7; Coe et al. 2012). However, the additional deeper H160
imaging, and the ultra-deep Y105 imaging, have also been crucial in
enabling the more robust selection of objects at z ~ 7 and z ~ 8
(with improved photometric redshifts; McLure et al. 2012), and a 
push to still fainter magnitudes. Another key goal of the UDF12 
programme was therefore to use the resulting improved samples, 
coupled with the more accurate 4-band near-infrared photometry, 
to undertake a new and unbiased study of the rest-frame ultraviolet 
(UV) spectral energy distributions (SEDs) of faint galaxies at z >
6.5.

This paper is thus focussed on revisiting the study of the
rest-frame UV SEDs of galaxies, and in particular their UV continuum
slopes, β (where f_λ ∝ λ^β; e.g. Calzetti, Kinney & Storchi-Bergmann
1994; Meurer et al. 1999), armed with the best-available,
near-infrared data required for this measurement at z ~ 7, 8 (and,
for the first time, at z ~ 9). Because the objects uncovered by HST
in the HUDF at these redshifts are too faint for informative
near-infrared spectroscopy with current facilities, a broad-band
determination of the UV continuum slope β at present offers the only
practical way of gaining insight into the rest-frame UV properties of
the early populations of galaxies emerging in the young Universe.
This (in principle simple, but in practice tricky) measurement is of 
astrophysical interest for a number of reasons. 

First, certainly at more modest redshifts, β has been shown
to be a good tracer of dust extinction in galaxies, as it has been
demonstrated to be well correlated with excess far-infrared emission
from dust (e.g. Meurer et al. 1999; Reddy et al. 2012; Heinis
et al. 2012). The reason this works is that (as we again demonstrate
later in this paper) an actively star-forming galaxy of ~ solar
metallicity would be expected to display β ~ −2 (= zero colour in
the AB magnitude system) in the absence of dust, and so any
significant deviation to redder values can be viewed as a signature of
significant dust extinction (albeit the relation between far-infrared
dust emission and UV-derived dust extinction is not expected to be
perfect, given that different regions of the galaxies might be observed
in such widely-separated wavelength regimes; e.g. Wilkins et al.
2012; Gonzalez-Perez et al. 2012). These results have been used to
interpret the apparent steady progress towards lower average values
of ⟨β⟩ with increasing redshift (from z ~ 4 to z ~ 7) in terms of
monotonically-decreasing average dust extinction, with important
implications for the inferred cosmological evolution of star-formation
activity (e.g. Hathi et al. 2008; Bouwens et al. 2009, 2012;
Castellano et al. 2012; Finkelstein et al. 2012).

Second, β is obviously also a function of age, although (as
we explicitly demonstrate later in this paper) the sensitivity is not
very strong for young, quasi-continuously star-forming sources,
and very blue values of β can certainly only be achieved for very
young stellar populations.

Third, β can be used as an indicator of metallicity. In practice
of course the impact of metallicity and dust extinction can be
degenerate for redder values of β, but blue values significantly lower
than β ~ −2 are an indicator of a low-metallicity stellar population.
For example NGC1705, one of the bluest local star-forming
galaxies with β ~ −2.3, is generally interpreted as being dust-free,
with a significantly sub-solar metallicity (Calzetti et al. 1994), as is
the low-mass galaxy BX418 at z ~ 2.3 for which Erb et al. (2010)
report β = −2.1, E(B − V) ~ 0.02, and Z ~ 1/6 Z⊙.

Fourth, β is influenced by the extent to which the emission
from the combined photospheres of the stars in a galaxy is
'contaminated' by nebular continuum. Nebular continuum emission is
significantly redder than the star-light from a very young,
low-metallicity stellar population (e.g. Leitherer & Heckman 1995),
and so, given other information (e.g. Stark et al. 2012; Labbe et
al. 2012) or assumptions, β can in principle be used to estimate (or
correct for) the level of nebular emission in a young galaxy. This
in turn can set constraints on the inferred escape fraction (f_esc) of
Hydrogen-ionizing photons. It is the rate-density of such photons
that must be estimated to chart the expected progress of cosmic
reionization by young galaxies (e.g. Robertson et al. 2010),
but such photons are not directly observable during the epoch of
reionization.

Thus, as discussed by many authors, there is a strong motivation
for attempting to measure β as accurately as possible, but the
interpretation of the results can clearly be problematic given the
degeneracies involved. Interestingly, however, the degree of
complication in interpretation is result-dependent. In particular, as
highlighted by Schaerer (2002) and Bouwens et al. (2010b), the
discovery of extremely blue values of β ~ −3 would offer a fairly
clean and powerful result, because such values can only be produced
by a stellar population which is simultaneously very young,
of extremely low metallicity, dust-free, and also free of significant
nebular emission (corresponding to a very high escape fraction for
ionizing photons). Since these are exactly the combined properties
which might be expected of the first galaxies (which possibly
commenced the reionization of the Universe; e.g. Paardekooper,
Khochfar & Dalla Vecchia 2012; Mitra, Ferrara & Choudhury 2012) the
measurement of β during the first billion years of cosmic time has
been a key focus of several recent studies of galaxies at z ~ 7
(Bouwens et al. 2010b, 2012; Dunlop et al. 2012; Finkelstein et
al. 2010, 2012; McLure et al. 2011; Wilkins et al. 2011; Rogers,
Dunlop & McLure 2012).

However, as discussed in detail by Dunlop et al. (2012) and
Rogers et al. (2012), previous attempts to determine the UV spectral
slope at faint magnitudes (M_UV > −19) have inevitably been
afflicted by bias. The interested reader is referred to these papers for
detailed discussions and simulations, but there are three key points
to consider, each related to photometric scatter and the resulting
impact on derived average values (⟨β⟩).

First, the selection band can bias the result if source selection 
is pushed to the 5-σ limit, and the primary selection band is then
also involved in colour determination. For example, imposition of a
J125 flux density threshold, as applied by Bouwens et al. (2010b),
inevitably leads to a blue bias in the derived average value of ⟨β⟩
if β is based on J125 − H160 colour.

Second, the classification of objects as robust high-redshift 
Lyman-break galaxies can also yield a subtle bias towards bluer 
values of β. Again this is only really an issue for the derivation
of average values from individual measurements with substantial 
photometric scatter. The point is that, while both colour-colour se- 
lection (e.g. Bouwens et al. 2010a, 2011) and photometric redshift 
selection (e.g. McLure et al. 2010, 2011) are sufficiently inclusive 
to include virtually all plausible star-forming galaxy SEDs at high 
redshift, photometric scatter can lead to some genuine high-redshift 
galaxies being misclassified as being at much lower redshift. Be- 
cause this only happens when the scatter yields erroneously red 
colours, the result can be clipping of the red end of the observed 
colour distribution, yielding a blue bias in the derived average ⟨β⟩.
Rogers et al. (2012) show that this bias is, unsurprisingly, essen- 
tially identical for colour selection and photometric-redshift se- 
lection provided all galaxies with a plausible high-redshift photo- 
metric solution are retained in the latter approach. If attention is 
confined to the most robust photometrically-selected high-redshift 
galaxies, then the bias is unfortunately inevitably more extreme 
(because a very blue colour longward of the putative Lyman break 
essentially guarantees a robust high-redshift solution; see Dunlop 
et al. 2012). 

Third, if, as advocated by Finkelstein et al. (2012), β is derived
from galaxy spectral energy distribution (SED) models (e.g.
Bruzual & Charlot 2003; hereafter BC03), the result can be a red
bias in ⟨β⟩. At first sight, the use of galaxy SEDs seems sensible,
but the problem is that model galaxy SEDs never produce β
significantly bluer than β ~ −3. While there is good reason to
believe that real galaxy SEDs can never actually yield β significantly
bluer than β ~ −3, if one wants to compute a population average
from a photometrically-scattered set of objects, it is (as already
emphasized) important not to artificially clip one end of the observed
distribution. In this case the effect of insisting on fitting plausible
galaxy SEDs is to clip the blue end of the distribution, because
any object which displays, say, β = −5 will be corrected back to
β = −3 (or whatever the most extreme β contained in the galaxy
SED library happens to be). The result is a red bias in the average
⟨β⟩. Thus, as explicitly demonstrated by Rogers et al. (2012), the
most robust way to determine an unbiased value of ⟨β⟩ is via a pure
power-law fit to the appropriate photometry.
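The clipping argument above can be illustrated with a small Monte Carlo sketch. This is not the simulation of Rogers et al. (2012); the intrinsic slope, the Gaussian scatter of 1.0 in β, the floor at β = −3, and the function name are all illustrative assumptions chosen only to show the direction of the effect.

```python
import random
import statistics


def simulate_clipping_bias(true_beta=-2.0, sigma=1.0, bluest_model=-3.0,
                           n_objects=20000, seed=1):
    """Return (unclipped mean, clipped mean) of noisy beta measurements."""
    rng = random.Random(seed)
    # Assumed toy population: every galaxy has the same intrinsic slope and
    # photometric noise scatters each measurement symmetrically about it.
    measured = [rng.gauss(true_beta, sigma) for _ in range(n_objects)]
    # A template library cannot return a slope bluer than its bluest model,
    # so any measurement below that floor is pulled back up to it.
    clipped = [max(b, bluest_model) for b in measured]
    return statistics.mean(measured), statistics.mean(clipped)


mean_free, mean_clipped = simulate_clipping_bias()
print(f"free power-law mean:   {mean_free:.2f}")
print(f"template-limited mean: {mean_clipped:.2f}")
```

Because only the blue tail is altered, the template-limited mean is always redder than the free mean, and the size of the shift grows with the photometric scatter.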

The primary aim of this paper is not to revisit these bias issues
but, rather, to use the new, deep, multi-band near-infrared photometry
provided by the UDF12 WFC3/IR imaging campaign to avoid
them, and deliver the first straightforward, unbiased measurement
of ⟨β⟩ for faint galaxies at z ~ 7 (comparable to, but somewhat
fainter than the luminosity regime where ⟨β⟩ ~ −3 was originally
claimed by Bouwens et al. 2010b). The UDF12 campaign was, in
part, designed with this goal in mind. First, the increase in depth
and the addition of an extra passband J140 allows significantly
more accurate measurements of UV slope down to M_UV ~ −17.
Second, the introduction of the deep J140 imaging allows object
selection to be based primarily on a band which has minimal influence
on derived UV slope (whether derived by J125 − H160 colour,
or by power-law fitting through J125, J140, H160).



A second aim of this paper is then to exploit both the new
UDF12 photometry, and the new z ~ 8 and z ~ 9 galaxies uncovered
by McLure et al. (2012) and Schenker et al. (2012) in our
UDF12 programme (Ellis et al. 2012; Koekemoer et al. 2012), to
present the first meaningful measurements of ⟨β⟩ at z ~ 8 and
z ~ 9. Inevitably these first results on β at even higher redshift apply
to somewhat brighter absolute magnitudes (M_UV ~ −18) than
the faintest bin explored at z ~ 7, but nevertheless it is of interest
to explore the behaviour of β back to within ~ 0.5 Gyr of the big
bang. Measurements of β at even earlier epochs will not be possible
until the launch of the James Webb Space Telescope (JWST) and
the advent of ground-based Extremely Large Telescopes (ELTs).

Finally, a third aim of this paper is to attempt to move beyond
the determination of ⟨β⟩ at z ~ 7, and explore whether the improved
accuracy of individual measurements of β afforded by the
UDF12 data provides any evidence for a significant intrinsic scatter
in β at z ~ 7. The impact of UDF12 on the evidence for a
colour-magnitude relation, and its potential evolution over a broader
redshift range z ~ 4 to z ~ 7 will be considered in a separate paper
(Rogers et al., in preparation).

This paper is structured as follows. In Section 2 we briefly
summarize the new UDF12+UDF09 dataset, and the way in which
our new high-redshift galaxy samples spanning the redshift range
6.5 < z < 12 have been selected. Then, in Section 3 we present
straightforward (but now essentially unbiased) 'traditional' colour
measurements of β at z ~ 7 (to aid comparison with previous
studies) and also for the first time at z ~ 8 and z ~ 9. In Section 4
we then present our 'best' measurements of β based on the
multi-band power-law fitting as developed and advocated in Rogers
et al. (2012). Here we also draw on the results of our simulations
to demonstrate the absence of any substantial bias in our measurements,
and to correct for any minor residual effects. We consider
the astrophysical implications of our results in Section 5, including
a comparison with the predictions of the latest hydrodynamical
models of galaxy evolution. Finally our conclusions are summarized
in Section 6. All magnitudes are quoted in the AB system
(Oke 1974) and any cosmological calculations assume Ω_M = 0.3,
Ω_Λ = 0.7, and H_0 = 70 km s⁻¹ Mpc⁻¹.



2 GALAXY SAMPLES 

2.1 UDF12 high-redshift galaxy sample selection 

New HUDF galaxy samples were selected from the final
UDF12+UDF09 dataset as follows. First, SExtractor (Bertin &
Arnouts 1996) was used to select any source which yielded a > 5-σ
detection in any one of the final single-band WFC3/IR Y105, J125,
J140 or H160 images, or in any contiguous stacked combination
of them (i.e. Y105+J125, J125+J140, J140+H160, Y105+J125+J140,
J125+J140+H160, Y105+J125+J140+H160). The catalogues from
these ten alternative detection runs were next merged to form a parent
sample which was then culled by rejecting any source which
showed a > 2-σ detection in any of the three shortest-wavelength
(B435, V606, i775) deep HST ACS optical images of the HUDF.
This process means that the resulting galaxy sample should remain
complete beyond z = 6.4 (although it will also yield some galaxies
in the redshift range z ~ 6-6.4; see McLure et al. 2011, 2012).

Multi-band aperture photometry was then performed at the position
of each object (as determined from the detection image which
yielded the highest signal:noise ratio detection), using 'matched'
circular apertures designed to contain 70% of the flux density
from a point source. The appropriate aperture diameters as used
are 0.5 arcsec at H160, 0.47 arcsec at J140, 0.44 arcsec at J125,
0.40 arcsec at Y105, and 0.3 arcsec for the ACS z850 photometry.
The WFC3/IR and z850 photometry was then all aperture-corrected
to 82% of total to correctly match the photometric limits achieved
for point sources in the bluer ACS bands within a 0.3-arcsec diameter
aperture. In addition, Spitzer IRAC photometry at 3.6 and
4.5 μm was included for each source. This was based on a new
deconfusion analysis of the deepest available IRAC HUDF imaging
(Labbe et al. 2012) using the method described by McLure et
al. (2011), and the final UDF12 H160 image as the best available
template for the galaxies to be fitted to the sky as seen in the
highly-confused IRAC imaging. In practice, for all but a few sources, this
yielded formal non-detections.

Photometric redshifts (with associated probability distributions)
were then derived on the basis of this 10-band photometry,
using the method described in McLure et al. (2011). Specifically, a
wide range of galaxy SEDs from the BC03 models were utilised
(limited only by insisting that galaxy ages were younger than the age
of the Universe at each redshift) and dust extinction was allowed
to float as high as A_V = 4. All flux-density measurements were
utilised in this model fitting, even in bands where sources were
undetected (including negative flux-density where measured, thus
ensuring a consistent derivation of χ²).

On the basis of the SED fitting, the sample was further refined
by retaining only those sources which displayed a
statistically-acceptable solution at z > 6.4 (i.e. redshift solutions with a
formally acceptable value of χ², in practice χ² < 15). At this stage
all remaining candidates were visually inspected, and rejected from
the catalogue if they lay too near the perimeter of the imaging, or
too close to bright sources for reliable photometry (a cull that is
reflected in the effective survey areas utilised by McLure et al. 2012
in the luminosity function analysis presented therein). A final visual
check was also performed to remove any object which yielded
a significant detection in a smoothed, stacked B435+V606+i775
pseudo broad-band optical ACS image (in order to further minimize
the number of low-redshift galaxy contaminants).

Finally, as in Dunlop et al. (2012), all of the
ACS+WFC3/IR+IRAC SED fits were inspected, and the sources
classified as ROBUST or UNCLEAR depending on whether
the secondary, low-redshift solution could be excluded at > 2-σ
significance, as judged by Δχ² > 4 between the secondary and
primary redshift solution. The final result of this process is a
sample of 146 sources, of which 97 are labelled as ROBUST
and 49 are UNCLEAR. Absolute rest-frame UV magnitudes at
λ_rest ~ 1500 Å (M_1500) have been calculated for all objects
by integrating the spectral energy distribution of the best-fitting
evolutionary synthesis model through a synthetic 'narrow-band'
filter of rest-frame width 100 Å (see McLure et al. 2011).
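The two χ²-based cuts described above (an acceptable primary solution at z > 6.4 with χ² < 15, then a ROBUST or UNCLEAR label from Δχ² > 4) can be sketched as follows. The function names, the dictionary layout, and the candidate values are hypothetical and invented purely for illustration; the visual-inspection steps are not represented.

```python
def has_acceptable_high_z_solution(z_best, chi2_best, z_min=6.4, chi2_max=15.0):
    """True if the best-fitting redshift solution passes the quoted cuts."""
    return z_best > z_min and chi2_best < chi2_max


def classify_candidate(chi2_primary, chi2_secondary, delta_chi2_min=4.0):
    """ROBUST if the secondary (low-redshift) solution is worse by more than
    delta_chi2_min, i.e. excluded at better than 2-sigma; otherwise UNCLEAR."""
    delta_chi2 = chi2_secondary - chi2_primary
    return "ROBUST" if delta_chi2 > delta_chi2_min else "UNCLEAR"


# Invented example candidates (not taken from the UDF12 catalogue).
candidates = [
    {"id": "A", "z": 7.1, "chi2_primary": 6.0, "chi2_secondary": 15.5},
    {"id": "B", "z": 6.8, "chi2_primary": 8.0, "chi2_secondary": 10.5},
    {"id": "C", "z": 5.9, "chi2_primary": 4.0, "chi2_secondary": 20.0},
]

labels = {
    c["id"]: classify_candidate(c["chi2_primary"], c["chi2_secondary"])
    for c in candidates
    if has_acceptable_high_z_solution(c["z"], c["chi2_primary"])
}
print(labels)
```

Candidate C is dropped before classification because its primary solution lies below z = 6.4.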

2.2 Further sample refinement for β analysis

For the specific purpose of the UV continuum slope analysis presented
in this paper, the sample was then split into three redshift
bins, yielding 116 galaxies at z ~ 7 (6.4 < z < 7.5), 24 galaxies at
z ~ 8 (7.5 < z < 8.5) and 6 galaxies at z ~ 9 (8.5 < z < 10). Finally,
to minimize any bias in β introduced at the galaxy-selection
stage, we decided to limit the final galaxy catalogues at z ~ 7 and
z ~ 8 to those objects which yielded a > 5-σ detection in the J140
band alone (i.e. J140 < 29.5 in the 0.47-arcsec diameter aperture).
This reduced the number of galaxies in each sub-sample to 45 at
z ~ 7 and 12 at z ~ 8, but still allows us to probe galaxy luminosities
down to M_UV ~ −17 at z ~ 7. This also means that we
are here exploiting the extra depth of the additional UDF12 Y105
and H160 imaging primarily for the determination of more accurate
measurements of β, rather than to push the detection threshold
to the absolute limit. This somewhat conservative approach has the
beneficial side-effect of reducing the fraction of UNCLEAR galaxies
in the final sample to only ~ 10%, minimizing the need to consider
the differences between analyses based on total or ROBUST-only
samples.

All derived numbers, plots, and simulations presented hereafter
assume the application of this J140 threshold at z ~ 7 and
z ~ 8. However, at z ~ 9, application of this threshold leaves
only 2 objects, and so we do not apply it. In any case, as outlined
in Section 3.2, by z ~ 9, J140 is the bluest band involved in the
β estimation, and so its usefulness as an unbiased band for galaxy
selection disappears. For this and other obvious reasons, the results
we present on ⟨β⟩ at z ~ 9 are treated separately, and should be
regarded only as tentative/indicative (as compared to the
statistically-robust results we now provide at z ~ 7 and z ~ 8).



3 TWO-BAND MEASUREMENT OF UV SLOPES 

The standard convention is to characterise the rest-frame UV continuum
slope via a power-law index, β, where f_λ ∝ λ^β. For objects
at z ~ 7, the Y105-band photometry could, in principle, be contaminated
by Lyman-α emission-line flux, and so it has been common
practice to limit the measurement of β at z > 6.5 to an estimator
based on J125 − H160 colour (e.g. Bouwens et al. 2010b; Dunlop
et al. 2012). Now, with the addition of the new J140 imaging from
the UDF12 campaign, a three-band (J125, J140, H160) power-law
fit can be performed at z ~ 7. As quantified in Rogers et al. (2012),
a three-band power-law fit is in fact the optimal way to determine
β for galaxies in this redshift regime, and so we apply this method
for the first time at these redshifts in Section 4. However, to facilitate
comparison with previous work, and to see the direct impact of
our deeper photometry and our J140 galaxy selection on the results,
we first perform the standard J125 − H160 colour estimation of β
on our new UDF12 J140-filtered sample. Moreover, at z ~ 9, only
J140 and H160 are capable of sampling the continuum longward of
λ_rest ~ 1215 Å, and so β has to be based upon J140 − H160 colour,
rendering power-law fitting once again essentially redundant.

The effective wavelengths of the filters of interest in
this study are J125: λ_eff = 12486 Å, J140: λ_eff = 13923 Å, and
H160: λ_eff = 15369 Å. We note that these are the 'pivot' wavelengths,
incorporating not only the filter transmission profiles, but
also the full throughput of WFC3/IR including detector sensitivity 
as a function of wavelength. There are of course a number of 
definitions of 'effective wavelength' for broad-band filters, but the 
'pivot' wavelength is the appropriate one for the present study, and 
in any case agrees with the alternative 'mean' (source independent) 
wavelength of the filter to within better than 1% (see, for example, 
Tokunaga & Vacca 2005). 

Adopting the above effective wavelengths, the appropriate
equations for converting from near-infrared colour to β are simply

\[ \beta = 4.43\,(J_{125} - H_{160}) - 2 \tag{1} \]

for measurements at z ~ 7-8 based on J125 − H160 colour, and

\[ \beta = 9.32\,(J_{140} - H_{160}) - 2 \tag{2} \]

for measurements at z > 8.5 which have to be based purely on
J140 − H160.
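As a check on the coefficients in equations (1) and (2), the conversion can be derived from the pivot wavelengths quoted above. For f_λ ∝ λ^β the flux density per unit frequency scales as f_ν ∝ λ^(β+2), so an AB colour between bands at λ_1 and λ_2 is m_1 − m_2 = −2.5 (β + 2) log10(λ_1/λ_2). The sketch below simply inverts that relation; the function names and the first-order error propagation are illustrative assumptions rather than the authors' code.

```python
import math

# Pivot wavelengths in Angstrom, as quoted in the text.
PIVOT_ANGSTROM = {"J125": 12486.0, "J140": 13923.0, "H160": 15369.0}


def colour_coefficient(blue_band, red_band):
    """Multiplier k in beta = k * (m_blue - m_red) - 2 for a pure power law."""
    ratio = PIVOT_ANGSTROM[blue_band] / PIVOT_ANGSTROM[red_band]
    return -1.0 / (2.5 * math.log10(ratio))


def beta_from_colour(colour, blue_band, red_band):
    """UV slope implied by an AB colour (blue band minus red band)."""
    return colour_coefficient(blue_band, red_band) * colour - 2.0


def beta_error(colour_error, blue_band, red_band):
    """First-order propagation: the colour error is scaled by the same k."""
    return colour_coefficient(blue_band, red_band) * colour_error


print(f"k(J125, H160) = {colour_coefficient('J125', 'H160'):.2f}")
print(f"k(J140, H160) = {colour_coefficient('J140', 'H160'):.2f}")
print(f"beta for J125 - H160 = -0.2: {beta_from_colour(-0.2, 'J125', 'H160'):.2f}")
```

The larger multiplier for the J140 − H160 pair reflects its shorter wavelength baseline, so a given colour uncertainty translates into roughly twice the uncertainty in β.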
Figure 1. Individual measurements of UV continuum slope, β, at z ~ 7
(upper panel, blue points) and at z ~ 8 (lower panel, green points) for
the galaxies in the new UDF12 samples (as detailed in Section 2.2) plotted
versus their UV absolute magnitudes (M_UV,AB = M_1500). The values of
β shown here are derived from the UDF12 data using J125 − H160 colours
as described in Section 3. The average values, along with standard errors in
the mean, are plotted (in black) for each 1-magnitude wide luminosity bin
which contains > 5 sources (see also Table 1). The UDF12 galaxy samples
used have been confined to those objects which are detected at > 5σ in the
J140-band, in order to minimize colour bias in the selection process (J140
photometry is not used here in the determination of β). To help provide
dynamic range, the samples at M_UV,AB < −19 have been supplemented
with > 8-σ objects from the UDF09P1 and UDF09P2 parallel WFC3/ACS
fields (17 objects at z ~ 7; 5 objects at z ~ 8). Errors for individual β
measurements are not shown, simply because the typical error can be judged
directly from the scatter in the plot (which, it transpires, is effectively all due
to photometric error; see Section 4.3).



As already noted by Dunlop et al. (2012), equation (1) differs
very slightly from the relation adopted by Bouwens et al. (2010b),
which is β = 4.29 (J125 − H160) − 2 (presumably due to the adoption
of slightly different effective wavelengths). However, the differences
in derived values of β are completely insignificant in the
current context (e.g. for J125 − H160 = −0.2, the Bouwens et al.
relation yields β = −2.86, while equation (1) yields β = −2.89).

Finally, we caution that equation (2) must be regarded with
some scepticism. First, the relatively short wavelength-baseline
provided by J140 − H160 colour is reflected in the rather large
coefficient by which colour (and hence also uncertainties in colour)
must be multiplied to yield an estimate of β. Second, whereas J125
and H160 provide independent samples of a galaxy SED, the J140
and H160 bands overlap, and hence the resulting measurements are
inevitably correlated to some extent. For both these reasons equation
(1) should be utilised rather than equation (2) whenever possible.
Nevertheless, out of curiosity, in Section 3.2 below we apply
equation (2) to the 6 objects in the 8.5 < z < 10 sample to obtain
a first direct observational estimate of ⟨β⟩ in this previously
unexplored redshift regime.

Figure 2. Individual measurements of UV continuum slope, β, (red points)
as derived from our UDF12 data using J140 − H160 colour for the 6 galaxies
at z ~ 9 (8.5 < z < 10; Ellis et al. 2012), plotted against their UV
absolute magnitudes (M_UV,AB = M_1500). The average value, ⟨β⟩, at
M_UV ~ −18 is indicated by the black point, with the error-bar corresponding
to the standard error in the mean (see Table 1). This is the first
attempt at such a measurement at this redshift, and the J140 − H160 colour
does not span a very large wavelength baseline. Moreover, with such a small
sample at z ~ 9, the statistical average is clearly not very robust. Nevertheless,
the available information suggests that ⟨β⟩ at M_UV,AB = −18 at
z ~ 9 has not changed dramatically from z ~ 7, and is still consistent with
β = −2.

3.1 Robust measurements at z ~ 7 and z ~ 8 

In Fig. 1 we show the results of our new J125 − H160 colour-based
determinations of β for the galaxies in the new UDF12 samples
at z ~ 7 and z ~ 8, plotted versus their UV absolute magnitudes
(M_UV,AB = M_1500). In this figure we also plot the average values,
along with standard errors in the mean, for each 1-magnitude wide
luminosity bin which contains > 5 sources. These values are tabulated
in Table 1. We emphasize that, in order to minimize colour
bias in the selection process, the UDF12 galaxy samples used have
been confined to those objects which are detected at > 5σ in the
J140-band as described in Section 2.2. Moreover, to further minimize
bias, both ROBUST and UNCLEAR objects were retained in
evaluating these average values of β (see Rogers et al. 2012), but in
practice the J140 cut ensures that virtually all objects are ROBUST,
and rejection of the 5 UNCLEAR objects at z ~ 7, and the sole
UNCLEAR object at z ~ 8 does not significantly change these
results.
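The binned averages described above (mean β and standard error of the mean in 1-magnitude wide bins, reported only for sufficiently populated bins) can be sketched as follows. The function name and the bin-edge convention (edges on integer magnitudes, giving centres at −19.5, −18.5, −17.5) are assumptions chosen to match the bin centres in Table 1, and whether the five-source threshold is inclusive is likewise an assumption here.

```python
import math
import statistics
from collections import defaultdict


def binned_mean_beta(m_uv, beta, bin_width=1.0, min_sources=5):
    """Return {bin centre: (mean beta, standard error, N)} per luminosity bin."""
    bins = defaultdict(list)
    for m, b in zip(m_uv, beta):
        # Assumed convention: bin edges fall on multiples of bin_width.
        centre = (math.floor(m / bin_width) + 0.5) * bin_width
        bins[centre].append(b)

    results = {}
    for centre, values in sorted(bins.items()):
        if len(values) >= min_sources:  # inclusive threshold is an assumption
            mean = statistics.mean(values)
            # Standard error of the mean from the sample standard deviation.
            sem = statistics.stdev(values) / math.sqrt(len(values))
            results[centre] = (mean, sem, len(values))
    return results


# Invented magnitudes and slopes, for illustration only.
example_m_uv = [-18.9, -18.7, -18.5, -18.3, -18.1, -17.5]
example_beta = [-2.2, -2.1, -2.0, -1.9, -1.8, -2.5]
for centre, (mean, sem, n) in binned_mean_beta(example_m_uv, example_beta).items():
    print(f"M_UV = {centre:.1f}: <beta> = {mean:.2f} +/- {sem:.2f} (N = {n})")
```

The single object near M_UV = −17.5 in this toy input falls in an under-populated bin and is therefore not reported.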
Table 1. Derived average UV continuum slopes, ⟨β⟩, and standard errors as
a function of absolute UV magnitude (in bins with ΔM_UV = 1 mag) and
redshift. The values given in column two are derived from simple two-band
colours as described in Section 3, and are only tabulated for luminosity bins
that contain > 5 galaxies from the UDF12 and/or UDF09 Parallel Fields
(see Figs 1 and 2). The values given in columns three and four are derived
using the power-law fitting technique described in Section 4, and have been
corrected for small residual biases as evaluated from the results of the
end-to-end source injection, retrieval, and measurement simulations detailed in
Section 4.1 (see Figs 3 and 4). The galaxy samples analysed at z ~ 7 and
z ~ 8 were restricted to objects detected at > 5-σ in J140 to minimize
colour-selection bias, as described in Section 2.2. The galaxy sample at
z ~ 9 was not restricted in this way (only 2 out of the 6 objects would
remain). We also note that, to further minimize bias, both ROBUST and
UNCLEAR objects were retained in evaluating these average values of β,
but in practice the J140 cut ensures that virtually all objects are ROBUST,
and rejection of the 5 UNCLEAR objects at z ~ 7, and the sole UNCLEAR
object at z ~ 8 does not significantly change these results. Finally, we note
that the M_UV bin centres quoted here refer to magnitudes based on 82%
of enclosed flux density for a point source as detailed in Section 2.1, and
are thus ~ 0.2 mag fainter than presumed total M_UV for unresolved (or
marginally-resolved) sources.



M_UV      ⟨β⟩ (J - H)     ⟨β⟩ (Power-law)   ⟨β⟩ (Power-law) 
          Mean            Mean              Weighted Mean 

z ~ 7 
-19.5     -1.72±0.12      -1.82±0.11        -1.94±0.11 
-18.5     -2.23±0.16      -2.08±0.15        -2.08±0.15 
-17.5     -2.02±0.29      -2.08±0.26        -2.03±0.26 

z ~ 8 
-19.5     -1.98±0.27      -1.80±0.22        -1.84±0.22 
-18.5     -1.96±0.27      -1.75±0.26        -1.75±0.26 

z ~ 9 
-18       -1.80±0.63 
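
The two kinds of average quoted in Table 1 can be sketched as follows. This is a minimal illustration only: the per-galaxy values are hypothetical, and the use of inverse-variance weights for the "weighted mean" is an assumption, since the weighting scheme is not specified in this section.

```python
import math

def mean_and_standard_error(values):
    """Unweighted mean and standard error of the mean (sample std / sqrt(N))."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance / n)

def weighted_mean(values, errors):
    """Inverse-variance weighted mean and its formal uncertainty (assumed scheme)."""
    weights = [1.0 / e ** 2 for e in errors]
    wsum = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / wsum
    return mean, math.sqrt(1.0 / wsum)

# Hypothetical per-galaxy beta values and 1-sigma errors (illustrative only,
# not measurements from the paper).
betas = [-2.4, -1.6, -2.1, -1.9, -2.5]
errors = [0.4, 0.3, 0.5, 0.3, 0.6]

avg, sem = mean_and_standard_error(betas)
wavg, wsig = weighted_mean(betas, errors)
```

The unweighted mean treats every galaxy equally, whereas the weighted mean is dominated by the objects with the smallest photometric errors, which is why the two columns can differ slightly in a given bin.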







Our success in, for the first time, essentially eliminating any 
significant colour bias from these measurements at z ~ 7 and 
z ~ 8 is further confirmed below in Section 4, by the power-law 
β determinations and associated end-to-end data simulations. Our 
new, robust results at z ~ 7 confirm and extend the main conclusion 
of Dunlop et al. (2012), that there is, as yet, no evidence for 
UV continua significantly bluer than β ≃ -2 in the currently 
detectable galaxy population at z ~ 7. 

The results presented here at z ~ 8 represent the first meaningful 
and unbiased measurement of ⟨β⟩ for a significant sample of 
galaxies at this even earlier epoch, but we cannot probe to such faint 
absolute magnitudes as at z ~ 7. Nevertheless, over the available 
dynamic range -20 < M_UV < -18 we clearly see no evidence for 
any significant change from z ~ 7, with the average UV continuum 
slope ⟨β⟩ again consistent with β = -2 (the z ~ 8 measurement 
remains less accurate simply because of the smaller sample size). 



3.2 Preliminary measurements at z ~ 9 

In Fig. 2 we show the results of our attempt to determine β for the 6 
new galaxies we have uncovered in the HUDF at 8.5 < z < 10, as 
reported by Ellis et al. (2012). These measurements are necessarily 
based on J_140 - H_160 colour, and the individual values of β thus 
derived are indicated by the red points in Fig. 2. The scatter is very 
large, as expected given the photometric errors and the limitations 
of equation (2) already discussed above. Nevertheless, since all of 
these objects have absolute UV magnitudes in the range -18.5 < 
M_UV < -17.5 we have proceeded to calculate the average ⟨β⟩ for 
a single bin centred at M_UV ≃ -18. This is shown by the black 
point, which corresponds to ⟨β⟩ = -1.80 ± 0.63 (where the error 
is the standard error in the mean). 
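
A two-band colour estimate of β of this kind can be sketched as below. The sketch assumes a pure power law, f_λ ∝ λ^β, AB magnitudes, and approximate WFC3/IR pivot wavelengths; the wavelength values and function names are assumptions introduced for illustration and are not taken from the text.

```python
import math

# Approximate WFC3/IR pivot wavelengths in Angstroms (assumed values).
PIVOT = {"J125": 12486.0, "J140": 13923.0, "H160": 15369.0}

def beta_from_colour(colour_ab, blue_band, red_band):
    """Convert an AB colour (blue - red) into a UV slope beta.

    For f_lambda ~ lambda**beta, f_nu ~ lambda**(beta + 2), so
    colour = 2.5 * (beta + 2) * log10(lam_red / lam_blue).
    """
    lam_blue, lam_red = PIVOT[blue_band], PIVOT[red_band]
    return colour_ab / (2.5 * math.log10(lam_red / lam_blue)) - 2.0

# A flat-f_nu source (zero colour) corresponds to beta = -2 in any filter pair.
beta_flat = beta_from_colour(0.0, "J140", "H160")
```

Because J_140 and H_160 are close in wavelength, a given colour error maps onto a much larger error in β than for the J_125 - H_160 pair, which helps explain the very large scatter of the individual z ~ 9 values.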

Such a measurement has not previously been possible at this 
redshift, and this first effort clearly yields a highly-uncertain result 
which should not be over-interpreted. Nevertheless, the current data 
provide no obvious evidence that ⟨β⟩ at M_UV ≃ -18 has changed 
dramatically between z ~ 7 and z ~ 9, with the average value still 
fully consistent with β = -2. 



4 POWER-LAW MEASUREMENTS AND DATA 
ANALYSIS SIMULATIONS 

We now proceed to determine β using the power-law fitting method 
as explored and optimized in Rogers et al. (2012). We have also 
performed a set of end-to-end data-analysis simulations, starting 
with the injection of sources into the real UDF12 images, in order to 
quantify any remaining residual bias in our derived average values 
of ⟨β⟩. We first describe these simulations, before proceeding to 
summarize the results. 

4.1 Source injection, retrieval and measurement simulations 

Our simulations begin by defining a distribution of UV slopes. For 
the present study we adopt a delta function at β = -2 as our 
reference model, but also consider 'top hat' distributions of various 
widths, as discussed below in Section 4.4. 

Next, we create an input catalogue of galaxies with β values 
drawn from the defined distribution, redshifts in the range 6 < z < 
9, and absolute magnitudes spanning -22 < M_UV < -16 (with 
the relative number density of objects at different magnitudes 
governed by the z = 7 luminosity function of McLure et al. 2010). A 
model SED is then created for each galaxy, incorporating the 
intrinsic colour, the IGM attenuation of flux blueward of the Lyman 
break, the redshifting of the spectrum into the observed frame, and 
then cosmological dimming. Empirical PSFs are then created with 
broad-band flux-densities based on the model SEDs (in practice, 
the PSFs are set to zero in the B_435, V_606, i_775 bands where the 
flux is entirely attenuated). The PSFs are then inserted into the real 
multi-wavelength UDF12 images, avoiding existing bright sources 
and regions of high rms noise where real candidates would be 
discarded. 
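
The construction of the input catalogue can be sketched as follows. This is only an illustrative outline under stated assumptions: the Schechter parameters are placeholders rather than the published McLure et al. (2010) values, redshifts are drawn uniformly, and all function names are hypothetical.

```python
import math
import random

def schechter_weight(m_uv, m_star, alpha):
    """Relative number density per magnitude for a Schechter function."""
    x = 10.0 ** (-0.4 * (m_uv - m_star))
    return x ** (alpha + 1.0) * math.exp(-x)

def draw_beta(rng, centre=-2.0, width=0.0):
    """Delta function when width == 0, otherwise a top hat of full width `width`."""
    if width == 0.0:
        return centre
    return rng.uniform(centre - width / 2.0, centre + width / 2.0)

def make_catalogue(n, rng, m_star=-20.1, alpha=-1.7, width=0.0):
    """Draw n fake galaxies; m_star and alpha are placeholder values."""
    grid = [-22.0 + 0.01 * i for i in range(601)]  # M_UV from -22 to -16
    weights = [schechter_weight(m, m_star, alpha) for m in grid]
    mags = rng.choices(grid, weights=weights, k=n)
    return [
        {"beta": draw_beta(rng, width=width),
         "z": rng.uniform(6.0, 9.0),  # uniform in redshift (an assumption)
         "M_uv": m}
        for m in mags
    ]

catalogue = make_catalogue(2000, random.Random(42))
```

Sampling magnitudes from the luminosity function, rather than uniformly, matters because the steep faint end ensures that photometric scatter moves more faint objects into a given bin than out of it, which is one of the effects the end-to-end simulations are designed to capture.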

Objects are then reclaimed using SExtractor; we accept only 
objects lying within 2 pixels of an input PSF centre, and then 
perform aperture photometry on these objects in exactly the same way 
as for the real galaxies. Photometric redshifts are then obtained 
using Le Phare (Arnouts et al. 1999; Ilbert et al. 2006) with the same 
BC03 models used in the real data analysis. We adopt an identical 
selection function to that used for the real data, and measure 
absolute magnitudes with the same synthetic filter on the best-fitting 
BC03 model. Following Rogers et al. (2012), UV continuum slopes 
are obtained by performing a power-law fit to the J_125, J_140, H_160 
flux densities (Y_105, J_125, H_160 in the parallel fields, accounting 
for partial attenuation of the Y-band by the Lyman break where 
appropriate). 
4.2 Results at z ~ 7 

In Fig. 3 we show the results of our power-law analysis for the 
galaxies at z ~ 7, split into the same three luminosity bins as in 
Fig. 1, and this time, for completeness, showing results for both the 
FULL (upper row) and ROBUST-source only (lower row) samples. 
The grid of power-law models fitted to the WFC3/IR photometry 
extends over a deliberately very large range, -8 < β < 5, to 
ensure that the observed β distribution is not artificially truncated. 
However, in practice (primarily due to the J_140 significance cut) 
the sample studied here has photometry of sufficient quality that no 
object yields a measured β < -4. 
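
A grid-based power-law fit of the kind described can be sketched as below. This is a minimal chi-squared version, assuming the fit is made to flux densities per unit frequency (f_ν ∝ λ^(β+2)) with a free normalization; the function name, grid step and fitting space are assumptions, not details confirmed by the text.

```python
def fit_beta(wavelengths, fluxes, errors, beta_min=-8.0, beta_max=5.0, step=0.01):
    """Return (best_beta, chi2) from a grid search over power-law slopes.

    fluxes are f_nu values; for f_lambda ~ lambda**beta the model shape
    is f_nu ~ lambda**(beta + 2), with the amplitude solved analytically.
    """
    lam0 = wavelengths[0]
    best_beta, best_chi2 = None, float("inf")
    n_steps = int(round((beta_max - beta_min) / step))
    for k in range(n_steps + 1):
        beta = beta_min + k * step
        shape = [(lam / lam0) ** (beta + 2.0) for lam in wavelengths]
        num = sum(f * s / e ** 2 for f, s, e in zip(fluxes, shape, errors))
        den = sum(s * s / e ** 2 for s, e in zip(shape, errors))
        amp = num / den  # best-fitting normalization at this beta
        chi2 = sum(((f - amp * s) / e) ** 2
                   for f, s, e in zip(fluxes, shape, errors))
        if chi2 < best_chi2:
            best_beta, best_chi2 = beta, chi2
    return best_beta, best_chi2

# Noise-free check with assumed pivot wavelengths (Angstroms) and beta = -2.3.
waves = [12486.0, 13923.0, 15369.0]
model = [(w / waves[0]) ** (-2.3 + 2.0) for w in waves]
beta_fit, chi2_fit = fit_beta(waves, model, [0.05, 0.05, 0.05])
```

Keeping the grid much wider than any physically plausible slope means that noisy objects are not piled up at an artificial boundary, which would otherwise bias the average.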

The grey histograms in Fig. 3 show the distribution of the 
power-law derived β values in each bin, and the small squares with 
error bars indicate the average ⟨β⟩ values (and associated standard 
errors). The results derived from the real data are compared here 
with the results from our reference simulation in which every fake 
galaxy is assigned β = -2 before being inserted into the UDF12 
imaging; the red histograms indicate the distribution of power-law 
β values retrieved from the simulations, with the red points 
indicating the corresponding average and standard error in each 
luminosity bin. The red points thus offer a measure of the bias in our 
measurements of average ⟨β⟩, which can be seen to be negligible 
for both the FULL and ROBUST samples. As expected, it can be 
seen that confining the sample to ROBUST sources results in the 
removal of a few of the redder galaxies in the faintest magnitude 
bin, but because the number of UNCLEAR sources is so small, the 
results are essentially unchanged (especially when measured 
relative to simulation expectation, which also moves slightly blueward 
in the ROBUST-source simulations). 

The final values given for the power-law determination of ⟨β⟩ 
at z ~ 7 in Table 1 are taken from the FULL-sample analysis 
shown in the upper row of Fig. 3, and are calculated relative to 
the simulated values (to correct for any small residual bias). As 
long as the appropriate correction is applied, the results are 
essentially identical if they are derived from the ROBUST-source only 
analysis presented in the lower row. Within the errors, all three 
luminosity bins at z ~ 7 are clearly consistent with β = -2, with a 
best-estimate of ⟨β⟩ = -2.1 in the fainter two bins. Reassuringly, 
the power-law estimates are fully consistent with the J_125 - H_160 
colour-based measurements presented in Section 3 (see Table 1 for 
details and errors). 
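
The correction "relative to the simulated values" can be read as a simple additive offset, sketched below. That reading is an assumption, and the numbers used are hypothetical rather than values from Figs 3 or 4.

```python
def bias_corrected_beta(mean_observed, mean_simulated, beta_input=-2.0):
    """Subtract the bias seen in the simulation from the observed average.

    The bias is the difference between the average beta recovered from the
    simulated galaxies and the value with which they were injected.
    """
    bias = mean_simulated - beta_input
    return mean_observed - bias

# Hypothetical example: the simulation returns -2.05 for an input of -2.0,
# so an observed average of -2.15 is corrected redward by 0.05.
corrected = bias_corrected_beta(-2.15, -2.05)
```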



4.3 Results at z ~ 8 

In Fig. 4 we show our power-law β determinations at z ~ 8. The 
values derived from the real data are again shown by the grey 
histograms, with the average and standard error indicated by the black 
squares with error bars. Similarly, the corresponding results for the 
β = -2 simulation are indicated in red. At z ~ 8 the J_140 
significance threshold leaves only two galaxies fainter than M_UV = -18 
so, as in Fig. 1, we limit our analysis to the two brighter bins. The 
samples are smaller, and so the corresponding random errors are 
larger, but again it can be seen that the values of ⟨β⟩ derived from 
the data are consistent with β = -2 in both luminosity bins, and 
the blue bias implied from the simulations is relatively modest 
(although it is slightly larger if only ROBUST objects are retained, as 
expected). 

As at z ~ 7, the final results for the power-law determination 
of ⟨β⟩ at z ~ 8 given in Table 1 are derived from the FULL-sample 
analysis shown in the upper row of Fig. 4, calculated relative to the 
simulated values. Again, within the errors, both luminosity bins at 
z ~ 8 are clearly consistent with β = -2, and the power-law 
estimates are fully consistent with the J_125 - H_160 colour-based 
measurements presented in Section 3 (see Table 1). 

Figure 3. The distribution of individual power-law β measurements at z ~ 
7, along with average values, ⟨β⟩ (and standard errors), plotted against UV 
absolute magnitude. Results are shown for all sources (upper row), and for 
ROBUST sources only (lower row). Simulations shown in red are based on 
2000 galaxies inserted with β = -2. The data from UDF12 are shown in 
grey/black. The data in the brightest two bins have been supplemented with 
a few sources from the two UDF09 parallel fields as discussed in the text 
and in the caption to Fig. 1. 

Figure 4. The distribution of individual power-law β measurements at 
z ~ 8, along with average values, ⟨β⟩ (and standard errors), plotted against 
UV absolute magnitude. Results are shown for all sources (upper row), and 
for ROBUST sources only (lower row). Simulations shown in red are based 
on 2000 galaxies inserted with β = -2. The data from UDF12 are shown 
in grey/black. The faintest bin shown for the z ~ 7 sources in Fig. 3 only 
contains two sources in our J_140-thresholded z ~ 8 sample, and so we do 
not attempt to show results at M_UV ≃ -17.5 here. The data in the brighter 
bin have been supplemented with a few sources from the two UDF09 
parallel fields, as discussed in the text and in the caption to Fig. 1. 



4.4 Trends with z, Muv, and evidence for scatter 

Our derived average values of ⟨β⟩ reveal no evidence for a 
significant trend with redshift over the limited redshift range explored 
here, 7 < z < 9. However, it must be remembered that we are 
currently unable to explore the faintest absolute magnitude bin studied 
at z ~ 7 at higher redshift. 

Our results also do not yield any significant evidence for a 
relation between ⟨β⟩ and M_UV at a given redshift, although again 
the available dynamic range is limited, and a full exploration of this 
issue is deferred to a future paper including results from brighter 
larger-area surveys. 

The one suggestive result that does merit additional scrutiny 
here is the apparent excess scatter seen in β at the faintest absolute 
magnitudes probed at z ~ 7. Specifically, in the lowest-luminosity 
bin plotted in Fig. 3 (z ~ 7, M_UV > -18) it appears that our 
simulation (red line in the histogram) does not replicate the observed 
scatter in β (grey region) as successfully as at brighter magnitudes. 
It is of interest to attempt to quantify the statistical significance of 
this effect, as a growth in the intrinsic scatter in β with decreasing 
luminosity might be expected if, for example, the faintest galaxy 
samples begin to include a significant number of very young, 
metal-poor objects. 

To do this we have expanded our simulations (beyond a single 
value of β = -2) to explore a variety of intrinsic β distributions. 
In particular, we considered alternative top-hat distributions for the 
input values of β in order to assess whether a wider intrinsic 
distribution can provide a significantly improved fit to the data in this 
faintest bin. To determine the statistical significance of our results, 
we used a K-S test, as illustrated in the comparison of the 
simulated and observed cumulative β distributions presented in Fig. 5. 
For clarity we restrict Fig. 5 to only 3 alternative input models, 
although for consistency we again show results for the full sample, 
and for ROBUST sources only. From the K-S test significance 
values given in Fig. 5 it can be seen that the β = -2 simulation in 
fact continues to provide a perfectly acceptable description of the 
data. Unsurprisingly, a wider intrinsic distribution can provide an 
improved fit, although the highest significance values are achieved 
if this distribution remains centred on a value close to β = -2 
(consistent with our results for ⟨β⟩). Thus, while it is clear that we 
cannot rule out the possibility that our galaxy sample contains some 
objects with UV slopes as blue as β ≃ -3, the current data certainly 
do not require any significant intrinsic scatter (even in this 
well-populated luminosity bin). For now, therefore, reliable conclusions 
can only be drawn on the basis of population-averaged values, ⟨β⟩. 
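
The comparison between an observed and a simulated β distribution can be sketched with a two-sample K-S test. The version below is a self-contained illustration using the standard asymptotic approximation for the p-value; it is not necessarily the implementation used for Fig. 5, and the function name is hypothetical.

```python
import math

def ks_two_sample(sample_a, sample_b):
    """Two-sample K-S statistic D and an asymptotic p-value."""
    a, b = sorted(sample_a), sorted(sample_b)
    na, nb = len(a), len(b)
    i = j = 0
    d = 0.0
    # Walk both sorted samples, tracking the gap between the two
    # empirical cumulative distributions.
    while i < na and j < nb:
        x = min(a[i], b[j])
        while i < na and a[i] <= x:
            i += 1
        while j < nb and b[j] <= x:
            j += 1
        d = max(d, abs(i / na - j / nb))
    n_eff = na * nb / (na + nb)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.3:
        return d, 1.0  # the series below tends to 1 for small arguments
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
                  for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)
```

A large p-value means only that the data cannot distinguish the model from the observations, so an acceptable fit for the β = -2 delta-function model does not by itself exclude some intrinsic scatter.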
Figure 5. A comparison of the distribution of β values derived for the real 
galaxies in the faintest luminosity bin probed here at z ~ 7 (-18 < 
M_UV < -17), with those predicted by alternative models based on 
different assumed intrinsic distributions of β. The upper panel shows all sources, 
while the lower panel contains ROBUST sources only. The grey regions 
show the cumulative distributions of β as derived from the data, while the 
coloured lines show the mock cumulative distributions as produced by the 
output from each alternative simulation. The significance (p) values for each 
model (under the null hypothesis that the real and simulated distributions 
are drawn from the same underlying distribution), are given in the top-left 
corner of each panel. Arrows show where the maximum deviation between 
the data and each simulation occurs, with the length of the arrow equal to 
the deviation. 



5 DISCUSSION 

5.1 Comparison with previous results 

At z ~ 7 our results can be compared with the recent work of 
Bouwens et al. (2012) and Finkelstein et al. (2012), and with our 
own previous study in Dunlop et al. (2012) which (as with the in- 
vestigations by Bouwens et al. 2010b, Finkelstein et al. 2010, and 
Wilkins et al. 2011) was based purely on the portion of UDF09 
WFC3/IR imaging obtained prior to the end of 2009. 

Based on the complete UDF09 dataset, at z ~ 7, Bouwens et 
al. (2012) reported a measurement for ⟨β⟩ at M_UV ≃ -18.25 of 
⟨β⟩ = -2.68 ± 0.19 ± 0.28, with the first error representing the 
random error, and the second the estimated systematic uncertainty 
(albeit presumably, in practice, not symmetric). This measurement 
is larger (redder) than the original Bouwens et al. (2010b) 
measurement of -3.0 ± 0.2 in the same luminosity bin, with the change 
being due to the availability of the final UDF09 dataset, improved 
assessment of bias, and (possibly) the removal of the J_125 
flux-density threshold in the galaxy-selection process (see Rogers et al. 
2012). Clearly, however, the Bouwens et al. (2012) result still 
remains bluer than that reported here at comparable absolute 
magnitudes, albeit the two can be reconciled within the errors (especially 
if the estimated systematic error is applied to the red, moving ⟨β⟩ 
to ~ -2.40). However, the Bouwens et al. (2012) results could 
easily still be biased blue in the faintest bin, as they did not have 
the advantage of the deeper photometry from UDF12 exploited 
here, nor could the J_140 significance threshold be applied to 
assist in the selection of unbiased, secure sub-samples of galaxies. 
Circumstantial evidence that the faintest measurement of ⟨β⟩ of 
Bouwens et al. (2012) remains biased to the blue is offered by the 
fact that the ⟨β⟩ value for their next brightest bin is much redder, at 
⟨β⟩ = -2.15 ± 0.12 ± 0.28, in excellent agreement with our own 
results at comparable magnitudes. 
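
The statement that two measurements "can be reconciled within the errors" can be illustrated by combining the quoted uncertainties in quadrature. The sketch below treats the systematic term as a symmetric Gaussian error, which is a simplifying assumption (the text notes that it is probably not symmetric), and it compares bins that are not at exactly the same M_UV.

```python
import math

def tension_sigma(value_a, errors_a, value_b, errors_b):
    """Difference between two measurements in units of the combined error.

    errors_a and errors_b are lists of independent 1-sigma terms
    (e.g. random and systematic), added in quadrature.
    """
    total = math.sqrt(sum(e * e for e in errors_a) + sum(e * e for e in errors_b))
    return abs(value_a - value_b) / total

# Bouwens et al. (2012) faint-bin value versus the z ~ 7 result quoted here.
n_sigma = tension_sigma(-2.68, [0.19, 0.28], -2.1, [0.2])
```

Under these assumptions the difference corresponds to well under two combined standard deviations, in line with the description of the two results as reconcilable.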

Finkelstein et al. (2012) have also reported a move to redder 
values of ⟨β⟩ for the faintest galaxies at z ~ 7 as found in the final 
UDF09 dataset compared to the original measurements made by 
Finkelstein et al. (2010). Specifically, Finkelstein et al. (2010) 
reported ⟨β⟩ = -3.07 ± 0.51, while Finkelstein et al. (2012) reported 
a median value of ⟨β⟩ = -2.68 (quoted with asymmetric errors), which 
becomes ~ -2.45 after bias correction. Given the errors, clearly this 
result, while still somewhat bluer, can be reconciled with our own, 
now more accurate measurements. 

In Dunlop et al. (2012) we aimed to highlight the potential 
for blue bias in the early measurements of ⟨β⟩ made 
in the immediate aftermath of the first discovery of faint z ~ 7 
galaxies with WFC3/IR. We utilised only the first epoch of the 
UDF09 dataset, and confined our attention to > 8σ sources, and 
therefore did not report a robust result for absolute magnitudes 
as faint as M_UV ≃ -18.5 at z ~ 7. We did, however, report 
⟨β⟩ = -2.12 ± 0.13 at M_UV ≃ -19.5 for z ~ 7 galaxies (in 
good agreement with the results from Bouwens et al. and 
Finkelstein et al. discussed above), and found ⟨β⟩ = -2.14 ± 0.16 at 
M_UV ≃ -18.5 for z ~ 5-6. Clearly these results are 
consistent with the values in the corresponding luminosity bins presented 
here at z ~ 7, confirming the apparent stability of ⟨β⟩ for the 
UV-selected population in this high-redshift regime (including the lack 
of any obvious redshift or luminosity dependence). 

Finally, we note that Finkelstein et al. (2012) did attempt a 
measurement of ⟨β⟩ at z ~ 8, and found -2.00 ± 0.32, although 
this preliminary measurement was not deemed trustworthy enough 
for inclusion in the abstract (in part because it was substantially 
redder than at z ~ 7). This result is in fact in very good agreement 
with the new, more robust determination at z ~ 8 presented here, 
and indeed seems less surprising given our final results at z ~ 7. 

5.2 Physical interpretation 

Although we cannot exclude some intrinsic scatter in β, the 
analysis presented in Section 4.4 shows that we have, as yet, no 
evidence for it. What is clear is that our derived average value of 
⟨β⟩ = -2.1 ± 0.2 at z ~ 7 excludes the possibility 
that a large subset of galaxies in our sample have extreme 
values β ≃ -3, as anticipated from very young, very low-metallicity, 
dust-free stellar populations. 

To illustrate this, and further explore the consequences of our 
measurements for the inferred physical properties of the 
currently-detectable galaxies, we show in Fig. 6 how our results compare with 
the values expected from stellar populations of different metallicity, 
nebular emission (related to ionizing escape fraction), dust 
reddening, and age (up to the age of the Universe at z ~ 7). In this figure 
our basic, most secure result at z ~ 7 and M_UV ≃ -18 (which is 
also consistent with our results at z ~ 8 and z ~ 9) is indicated by 
the horizontal dotted line, while the 1σ uncertainty (standard error) 
is shown by the surrounding grey shaded band (which acceptable 
models should therefore intercept at plausible ages). 

Figure 6. A comparison of our observed average value of ⟨β⟩ for 
galaxies with M_UV ≃ -18 at z ~ 7 with the predicted age-dependence of β 
from alternative stellar population models with and without nebular 
emission (with age only plotted up to the age of the Universe at z ~ 7). In both 
panels our most robust data point is indicated by the horizontal dotted line, 
with the 1σ uncertainty (standard error) shown by the surrounding grey 
shaded band. In the upper panel the solid lines are derived from 
instantaneous starburst models, with metallicities of Z_⊙ (red), 0.2 Z_⊙ (green), and 
0.02 Z_⊙ (blue), assuming zero nebular emission (i.e. f_esc = 1). The dashed 
lines are produced by adding nebular continuum emission self-consistently 
(Robertson et al. 2010), assuming the extreme case f_esc = 0 (see text for 
details). In the lower panel we show constant star-formation models, again 
with and without nebular emission, but this time only for Z_⊙ (red) and 
0.2 Z_⊙ (green) models. The dark-green curves show the effect of adding 
modest dust obscuration/reddening to the 0.2 Z_⊙ (brighter green) model 
(≡ A_V ≃ 0.1 for SNII dust extinction, or ≡ A_V ≃ 0.2 for the extinction 
law of Calzetti et al. 2000). 

The predictions of β as a function of age shown in Fig. 6 have 
been produced using the BC03 evolutionary models. Nebular 
continuum emission has been added to the stellar-population templates 
self-consistently, based on the flux of Hydrogen-ionising photons 
predicted from each BC03 model (using the code developed by 
Robertson et al. 2010). The nebular continuum includes 
free-free and free-bound emission by H, neutral He and 
singly-ionised He, as well as the two-photon continuum of H (see 
the prescription given in Schaerer 2002). 

In the upper panel of Fig. 6 we plot three alternative 
instantaneous starburst models, with metallicities equal to the solar value 
Z_⊙ (red), 0.2 Z_⊙ (green), and 0.02 Z_⊙ (blue). For each model we 
show the pure stellar prediction (i.e. zero nebular emission ≡ an 
ionizing photon escape fraction of unity, f_esc = 1) as a solid line, 
and the extreme alternative of maximum nebular contribution (≡ 
zero ionizing photon escape fraction, f_esc = 0) by the dashed line 
of the same colour. Not surprisingly for these burst models, the 
impact of the nebular continuum becomes negligible after ~ 10 Myr. 
During the time period when it is significant, its impact is more 
pronounced the lower the adopted metallicity (see also figure 4 in 
Bouwens et al. 2010b). 

From Fig. 6 we infer that our results are inconsistent with 
very young and very low metallicity models. Moreover, while 
β ≃ -2.1 can be produced by essentially any metallicity at a 
carefully-selected age, the speed with which the UV continuum 
reddens with age, coupled with the homogeneity of our results, 
argues strongly that the burst models illustrated in the upper panel 
(and in Bouwens et al. 2010b) are inappropriate, and in any case 
physically unrealistic. 

A more natural assumption is that the galaxies selected by 
our rest-frame UV-selection technique are forming stars 
quasi-continuously (at least on average, especially at these early times). 
Therefore, in the lower panel of Fig. 6 we show predictions for 
constant star-formation models. Again we show the Z_⊙ (red) and 
0.2 Z_⊙ (green) models, but this time (for clarity) omit the 0.02 Z_⊙ 
model because, without dust-reddening, the 0.2 Z_⊙ model is 
already too blue until an age of ~ 1 Gyr (and the inferred trend 
with further reduction in metallicity is clear). Instead, we add a 
second version of the 0.2 Z_⊙ model with modest dust 
obscuration/reddening (≡ A_V ≃ 0.1 for SNII dust extinction, or ≡ A_V ≃ 
0.2 for the extinction law of Calzetti et al. 2000). This last curve (in 
dark green) illustrates the degeneracy between dust extinction and 
the assumed metallicity of the stellar population. 
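
The size of the reddening implied by such modest dust can be estimated with a short calculation. The sketch below uses the Calzetti et al. (2000) attenuation curve in its standard parametrization and approximates the change in β by the slope between two rest-frame wavelengths; the choice of 1500 and 3000 Å as anchor points and the function names are assumptions made for illustration.

```python
import math

R_V = 4.05  # total-to-selective attenuation for the Calzetti et al. (2000) law

def calzetti_k(lam_um):
    """Calzetti et al. (2000) curve k(lambda), valid for 0.12 <= lambda < 0.63 micron."""
    return 2.659 * (-2.156 + 1.509 / lam_um
                    - 0.198 / lam_um ** 2
                    + 0.011 / lam_um ** 3) + R_V

def delta_beta(a_v, lam1_um=0.15, lam2_um=0.30):
    """Approximate reddening of beta produced by A_V magnitudes of attenuation.

    Uses the two-point slope of log f_lambda versus log lambda between
    lam1_um and lam2_um (assumed anchor wavelengths).
    """
    ebv = a_v / R_V
    a1 = calzetti_k(lam1_um) * ebv
    a2 = calzetti_k(lam2_um) * ebv
    return -0.4 * (a1 - a2) / math.log10(lam1_um / lam2_um)

shift = delta_beta(0.2)  # reddening in beta for A_V = 0.2
```

Under these assumptions A_V = 0.2 reddens β by a few tenths, which is comparable to the offset between the dust-free 0.2 Z_⊙ track and the observed average, and so illustrates why metallicity and dust are degenerate here.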

Clearly, these continuously star-forming models provide a 
much more plausible explanation of our results, and are capable 
of delivering the observed homogeneous value of β without any 
requirement for fine tuning in age (i.e. β is little changed over the 
relevant timescale, ~ 10 Myr to ~ 100 Myr). One is then left to 
choose between solar metallicity with very little room for any 
additional dust reddening, or moderately sub-solar metallicity stellar 
populations coupled with modest dust reddening. The degeneracies 
are clear, but the latter scenario is arguably more plausible, and in 
fact happens to correspond well with the physical properties 
predicted for the currently observable galaxies (i.e. M_UV < -17) 
from the cosmological galaxy-formation simulation discussed 
below. Finally, we note that while, as expected, the contribution from 
nebular emission persists to much longer times in these continually 
star-forming models, even the extremes adopted here of f_esc = 1 
and f_esc = 0 yield unobservably small differences in observed 
β. Thus, if quasi-continuous star-forming galaxies of modestly 
sub-solar metallicity are the correct interpretation of our results, there 
is currently no realistic prospect of estimating f_esc from 
measurements of the UV continuum slope at z ~ 7. 



5.3 Comparison with galaxy-formation model predictions 

It is instructive to compare our findings with the predictions of 
a state-of-the-art cosmological galaxy-formation simulation. The 
simulation used here has been recently described, and its basic 
observational predictions (e.g. luminosity function, mass function) 
verified by Dayal et al. (2012). Interested readers are referred to 
Maio et al. (2007, 2009, 2010) and Campisi et al. (2011) for 
complete details of the simulation, but the key details can be 
summarized as follows. 

Figure 7. Comparison of our most accurate (bias-corrected power-law) 
measurements of average ⟨β⟩ as a function of M_UV at z ~ 7 with the 
predicted β values for individual galaxies as derived from the 10 Mpc 
cosmological galaxy-formation simulation described in Section 5.3 (see also 
Dayal et al. 2012). The data plotted here are given in column 4 of Table 1. 
The model predictions include the effects of dust, and therefore correspond 
to the predicted observed values of β. 

The simulation has been carried out using the TreePM-SPH 
code GADGET-2 (Springel 2005), within the ΛCDM cosmology 
given in Section 1, and assuming a baryon density parameter Ω_b = 
0.04, a primordial spectral index n_s = 1 and a spectral 
normalisation σ_8 = 0.9. The periodic simulation box has a comoving size of 
10 h^-1 Mpc and contains 320^3 DM particles and, initially, an equal 
number of gas particles. The masses of the gas and DM particles 
are 3 × 10^5 h^-1 M_⊙ and 2 × 10^6 h^-1 M_⊙, respectively. 
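
The quoted particle masses follow from the box size, particle number and density parameters, as the following check illustrates. The matter density parameter Ω_m = 0.3 is an assumption made here for illustration (the adopted cosmology is stated in Section 1, outside this section), and the function name is hypothetical.

```python
RHO_CRIT = 2.775e11  # critical density in h^2 Msun per Mpc^3

def particle_mass(omega, box_size, n_per_side):
    """Mean particle mass in h^-1 Msun.

    omega      : density parameter of the component
    box_size   : comoving box side in h^-1 Mpc
    n_per_side : cube root of the number of particles
    """
    return omega * RHO_CRIT * box_size ** 3 / n_per_side ** 3

OMEGA_B = 0.04  # baryon density parameter quoted in the text
OMEGA_M = 0.3   # assumed total matter density parameter

m_gas = particle_mass(OMEGA_B, 10.0, 320)
m_dm = particle_mass(OMEGA_M - OMEGA_B, 10.0, 320)
```

With these inputs the gas and DM particle masses come out close to 3 × 10^5 and 2 × 10^6 h^-1 M_⊙ respectively, consistent with the values quoted above.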

The code includes the molecular chemistry of 13 primordial 
species: e-, H, H-, He, He+, He++, H2, H2+, D, D+, HD, HeH+ 
(Yoshida et al. 2003; Maio et al. 2007, 2009), PopIII and PopII/I 
star formation according to the corresponding initial mass function 
(IMF; Tornatore, Ferrara & Schneider 2007), gas cooling from 
resonant and fine-structure lines (Maio et al. 2007) and feedback 
effects (Springel & Hernquist 2003). The runs track individual heavy 
elements (e.g. C, O, Si, Fe, Mg, S), and the transition from the 
metal-free PopIII to the metal-enriched PopII/I regime is 
determined by the underlying metallicity of the medium, Z, compared 
with the critical value of Z_crit = 10^-4 Z_⊙ (see Bromm & Loeb 
2003). If Z < Z_crit, a Salpeter IMF is used, with a mass range 
100-500 M_⊙; otherwise, a standard Salpeter IMF is used in the 
mass range 0.1-100 M_⊙, and a SNII IMF is used for 8-40 M_⊙ 
(Bromm et al. 2009; Maio et al. 2011). 

The chemical model follows the detailed stellar evolution of 
each SPH particle. At every timestep, the abundances of 
different species are consistently derived using the lifetime function 
(Padovani & Matteucci 1993) and metallicity-dependent stellar 
yields. The yields from SNII, AGB stars, SNIa and pair instability 
supernovae have been taken from Woosley & Weaver (1995), van 
den Hoek & Groenewegen (1997), Thielemann et al. (2003) and 
Heger & Woosley (2002) respectively. Metal mixing is mimicked 
by smoothing metallicities over the SPH kernel and pollution is 
driven by wind feedback, which causes metal spreading over ~ kpc 
scales at each epoch (Maio et al. 2011). 

Galaxies are recognized as gravitationally-bound groups of 
at least 32 total (DM+gas+star) particles by running a friends-of- 
friends (FOF) algorithm, with a comoving linking-length of 0.2 in 
units of the mean particle separation. Substructures are identified 
by using the SubFind algorithm (Springel, Yoshida & White 2001; 
Dolag et al. 2009) which discriminates between bound and non- 
bound particles. Of the galaxies identified in the simulation, we 
only use those that contain at least 10 star particles (at least 145 
total particles) in our calculations at z ~ 7. For each such well- 
resolved galaxy, we obtain the properties of all its star particles, 
including the redshift of, and mass/metallicity at formation. 

Finally, the SED of each galaxy in the simulation snapshot 
at z ~ 7 is calculated by assuming each star particle forms in a 
burst and then evolves passively. If, when a star particle forms, the 
metallicity of its parent gas particle is less than Z_crit = 10^-4 Z_⊙, 
we use the PopIII SED (Schaerer 2002); if a star particle forms 
out of a metal-enriched gas particle (Z > Z_crit), the SED is 
computed via the population synthesis code STARBURST99 (Leitherer 
et al. 1999), using its mass, stellar metallicity, and age. The 
composite spectrum for each galaxy is then calculated by summing the 
SEDs of all its star particles, and the intrinsic continuum 
luminosity, L_c^int, is calculated at λ_rest = 1500 Å. We also self-consistently 
compute the dust mass and attenuation for each galaxy in the 
simulation box assuming type II SN (SNII) to be the main dust 
producers (Maiolino et al. 2006; Stratta et al. 2007). The dust mass is 
converted into an optical depth to UV photons assuming the dust 
is made of carbonaceous grains spatially distributed as the gas (see 
Dayal et al. 2010 and Dayal & Ferrara 2012 for complete details 
of this calculation). The predicted observed UV luminosity is then 
L_c^obs = L_c^int × f_c, where f_c is the fraction of continuum photons 
that escape unattenuated by dust. The intrinsic value of β (β_int) is 
calculated by fitting a power-law through the intrinsic SED of each 
galaxy over λ_rest = 1500-3000 Å, with predicted 'observed' β 
values derived by repeating the power-law fitting after taking into 
account the dust enrichment and applying the SN extinction curve 
(Bianchi & Schneider 2007). 
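
The two steps described here, attenuating the intrinsic luminosity and fitting a power law over 1500-3000 Å, can be sketched as follows. The attenuation curve used below is a toy placeholder varying as 1/λ, not the SN extinction curve of Bianchi & Schneider (2007), and all function names are hypothetical.

```python
import math

def fit_power_law_slope(wavelengths, f_lambda):
    """Least-squares slope of log10(f_lambda) versus log10(wavelength)."""
    xs = [math.log10(w) for w in wavelengths]
    ys = [math.log10(f) for f in f_lambda]
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    return sxy / sxx

def observed_luminosity(l_int, f_c):
    """L_obs = L_int * f_c, with f_c the escaping fraction of continuum photons."""
    return l_int * f_c

def toy_attenuation_mag(wavelength, a_1500=0.5):
    """Placeholder attenuation in magnitudes, scaling as 1/lambda (assumed)."""
    return a_1500 * 1500.0 / wavelength

def redden(wavelengths, f_lambda, a_1500=0.5):
    """Apply the toy attenuation curve to an intrinsic spectrum."""
    return [f * 10.0 ** (-0.4 * toy_attenuation_mag(w, a_1500))
            for w, f in zip(wavelengths, f_lambda)]

# Rest-frame grid from 1500 to 3000 Angstroms and an intrinsic beta = -2.4.
waves = [1500.0 + 100.0 * k for k in range(16)]
intrinsic = [w ** -2.4 for w in waves]
beta_int = fit_power_law_slope(waves, intrinsic)
beta_obs = fit_power_law_slope(waves, redden(waves, intrinsic))
```

Because any realistic UV extinction curve removes more light at shorter wavelengths, the fitted 'observed' slope is always redder than the intrinsic one, which is the effect that moves the simulated galaxies from intrinsically blue values towards β ≃ -2.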

In Fig. 7 we show the final predicted 'observed' individual 
values of β for galaxies in this simulation at z ~ 7, plotted against 
absolute UV magnitude, and compared against our best 
measurement of the actual (weighted) mean ⟨β⟩ values at z ~ 7, as given in 
column 4 of Table 1. Clearly the predictions of this simulation are 
in good agreement with our new results in a number of ways. First, 
within the magnitude range probed by our data, the average value 
of β is essentially exactly as predicted. The predicted intrinsic 
scatter in β at these 'brighter' magnitudes is also rather small, perfectly 
consistent with the analysis presented in Section 4.4. Moreover, the 
galaxies at M_UV < -17 in the simulation have moderately 
sub-solar metallicities (and hence intrinsically blue β values), but their 
chemical enrichment history is associated with enough dust to 
redden the observable values to β values entirely consistent with our 
average results. Interestingly, this is essentially the same as one of 
the scenarios we already considered in Section 5.2 as a possible 
explanation of our observed β values. Finally, the simulation does 
not yield a significant β - M_UV relation over the magnitude range 
probed in the current study, but does reinforce the expectation that 
the distribution should broaden towards substantially bluer values 
of β at very faint magnitudes (as a significant population of 
low-metallicity, dust-free objects finally emerges). 

The agreement shown in Fig. 7 may be fortuitous, and a full
comparison with alternative galaxy-formation model predictions is
beyond the scope of this paper. Nevertheless, Fig. 7 (and the associated
physical properties of the simulated galaxies) does provide an
interesting perspective on our findings. In particular, it shows that
an interpretation of our results in terms of only moderately sub-solar
metallicities, coupled with modest dust reddening, is at least
physically plausible within the magnitude range probed to date, and
that the long-anticipated population of really metal-poor dust-free
objects possibly lies at much fainter magnitudes than originally
suggested by the early results of Bouwens et al. (2010b). In addition,
it serves to remind us that the theoretically-predicted scatter
in β at these epochs is on a completely different scale to that seen
in our raw individual galaxy β distributions, the latter being still
utterly dominated by the impact of photometric errors (compare,
for example, Fig. 7 and Fig. 1). Nevertheless, Fig. 7 also provides
strong continued motivation for pushing the robust measurement of
⟨β⟩ to ever fainter galaxy luminosities in the young Universe.
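
To make the point about photometric errors concrete, the following
sketch combines an intrinsic scatter and a photometric scatter in
quadrature. It assumes independent Gaussian errors, and the two input
values are purely illustrative (they are not measurements from this
paper); it is not the injection and retrieval analysis of Section 4.3.

```python
import math

def observed_scatter(sigma_int, sigma_phot):
    """Observed scatter in beta for independent Gaussian contributions."""
    return math.sqrt(sigma_int ** 2 + sigma_phot ** 2)

def intrinsic_scatter(sigma_obs, sigma_phot):
    """Invert the quadrature sum; returns 0 if no excess scatter is seen."""
    excess = sigma_obs ** 2 - sigma_phot ** 2
    return math.sqrt(excess) if excess > 0 else 0.0

# Illustrative values only: a small intrinsic scatter and a photometric
# scatter ten times larger.
sigma_int = 0.1
sigma_phot = 1.0

sigma_obs = observed_scatter(sigma_int, sigma_phot)
print(f"observed scatter = {sigma_obs:.3f}")
```

With these illustrative inputs the observed scatter exceeds the
photometric term by only about 0.5 per cent, which is why a small
intrinsic scatter is so hard to detect in the raw β distribution.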

5.4 Implications for reionization 

A full analysis of the implications of the combined results of the
UDF12 programme for our understanding of cosmic reionization
will be presented by Robertson et al. (in preparation). This analysis
will necessarily draw on the new luminosity-function measurements
presented by McLure et al. (2012) and Schenker et al. (2012),
and on the new measurements of ⟨β⟩ presented here, because determining
the ability of the emerging galaxy population at z ~ 6-9
to reionize the Universe requires knowledge not only of the number
density of the galaxies, but also information on their ability to
supply the required ionizing photons. As discussed in Section 1,
the ability of a given galaxy to contribute to the ionization of the
surrounding intergalactic medium depends on its luminosity, the
age and metallicity of its stellar population, and the escape fraction
of ionizing photons.

Unfortunately, the relatively modest values of ⟨β⟩ found in the
present study do not easily lend themselves to straightforward
interpretation. In particular, Fig. 6 suggests that there is little prospect
of using our new measurements of ⟨β⟩ to set new constraints on
the ionizing-photon escape fraction, f_esc, without significant
additional information (e.g. a meaningful estimate of the contribution
of nebular emission lines to the rest-frame optical colours, as
measured by Spitzer IRAC; e.g., Stark et al. 2012; Labbé et al. 2012).

On the other hand, the degeneracy between metallicity and
dust obscuration is somewhat less of an issue for reionization
calculations than for calculations of cosmic star-formation rate density,
as the former are concerned only with the UV photons which survive
to exit a galaxy and potentially contribute to cosmic reionization.
Moreover, our finding that ⟨β⟩ remains close to β = −2 at z ~ 7, 8
and even z ~ 9 suggests that the galaxies detected to date already
contain relatively mature, metal-enriched stellar populations, lending
support to a picture in which star-formation (and hence cosmic
reionization) commenced at significantly higher redshifts.



6 CONCLUSIONS 

We have used the new ultra-deep, near-infrared imaging of the
HUDF provided by our UDF12 HST WFC3/IR imaging campaign
to explore the rest-frame UV properties of galaxies at redshifts
z > 6.5. In this study we have exploited the final multi-band
WFC3/IR imaging (UDF12+UDF09) to select deeper and more reliable
galaxy samples at z ~ 7, z ~ 8, and z ~ 9, and to provide
improved photometric redshifts. Most importantly, we have used
the enhanced dataset to provide more accurate photometry, and to
base galaxy selection primarily on the new J_140 imaging, which
exerts minimal influence on the derivation of the UV spectral index
β (f_λ ∝ λ^β) from either J_125 − H_160 colour (e.g. Bouwens et
al. 2010b; Dunlop et al. 2012), or J_125 + J_140 + H_160 power-law
fitting (as advocated by Rogers et al. 2012). Our main results are as
follows.

i) We have produced the first robust and unbiased measurement of
the average UV power-law index, ⟨β⟩ (f_λ ∝ λ^β), for faint galaxies
at z ~ 7, finding ⟨β⟩ = −2.1 ± 0.2 for galaxies with
M_UV ≃ −18. This result means that the faintest galaxies uncovered
to date at this epoch have, on average, UV colours no more
extreme than those displayed by the bluest star-forming galaxies
found in the low-redshift Universe.

ii) We have made the first meaningful measurements of ⟨β⟩ at
z ~ 8, finding a similar value, ⟨β⟩ = −1.9 ± 0.3.

iii) We have offered a tentative first estimate of ⟨β⟩ at z ~ 9 (based
on the J_140 − H_160 colours of the six galaxies in the redshift range
8.5 < z < 10 reported in Ellis et al. 2012), and find ⟨β⟩ = −1.8 ± 0.6,
essentially unchanged from z ~ 6-7 (albeit highly uncertain).

iv) Finally, we have used careful end-to-end source
injection+retrieval+analysis simulations to quantify any small residual
biases in our measurements, and to test for any evidence for significant
intrinsic scatter in the β values displayed by the galaxies
in the faintest luminosity bin which we can study at z ~ 7. While
models including a range of β values provide a modestly-improved
description of the data, we find that there is, as yet, no evidence
for a significant intrinsic scatter in β within our new z ~ 7 galaxy
sample.
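
The colour-based estimates of β referred to above rest on a simple
relation: for f_λ ∝ λ^β and AB magnitudes, a two-filter colour maps
linearly onto β. The sketch below derives the coefficient from
approximate WFC3/IR pivot wavelengths, which are assumed values
introduced here for illustration; it ignores filter widths, the
Lyman break and photometric errors.

```python
import math

# Approximate WFC3/IR pivot wavelengths in microns (assumed values).
PIVOT_MICRON = {"J125": 1.2486, "J140": 1.3923, "H160": 1.5369}

def colour_coefficient(blue, red):
    """Coefficient a in beta = a * (m_blue - m_red) - 2 for AB magnitudes.

    Follows from f_nu ~ lambda^(beta + 2), so that
    m_blue - m_red = 2.5 * (beta + 2) * log10(lambda_red / lambda_blue).
    """
    ratio = PIVOT_MICRON[red] / PIVOT_MICRON[blue]
    return 1.0 / (2.5 * math.log10(ratio))

def beta_from_colour(colour, blue, red):
    """UV slope implied by an AB colour between two filters."""
    return colour_coefficient(blue, red) * colour - 2.0

a_jh = colour_coefficient("J125", "H160")     # roughly 4.43
a_j140h = colour_coefficient("J140", "H160")  # roughly 9.32
print(f"J125-H160: beta = {a_jh:.2f} * colour - 2")
print(f"J140-H160: beta = {a_j140h:.2f} * colour - 2")
```

The larger coefficient for J_140 − H_160 reflects the shorter
wavelength baseline: a given colour error translates into roughly
twice the error in β, consistent with the z ~ 9 estimate being the
most uncertain of the three.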

Our results exclude the possibility that even our faintest galaxy
samples contain a substantial population of very low-metallicity,
dust-free objects with β ≃ −3. Rather, our findings are most
easily explained by a population of steadily star-forming galaxies
with either ~ solar metallicity and zero dust, or moderately
sub-solar (~ 10-20%) metallicity with modest dust obscuration
(A_V ≃ 0.1-0.2). This latter interpretation is consistent
with the predictions of a state-of-the-art galaxy-formation simulation,
which also suggests that a significant population of very-low
metallicity, dust-free galaxies with β ≃ −2.5 may not emerge until
M_UV > −16, a regime likely to remain inaccessible until the
JWST.